{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Using Opik with Haystack\n",
    "\n",
    "[Haystack](https://docs.haystack.deepset.ai/docs/intro) is an open-source framework for building production-ready LLM applications, retrieval-augmented generative pipelines and state-of-the-art search systems that work intelligently over large document collections.\n",
    "\n",
    "In this guide, we will showcase how to integrate Opik with Haystack so that all the Haystack calls are logged as traces in Opik."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating an account on Comet.com\n",
    "\n",
    "[Comet](https://www.comet.com/site?from=llm&utm_source=opik&utm_medium=colab&utm_content=haystack&utm_campaign=opik) provides a hosted version of the Opik platform, [simply create an account](https://www.comet.com/signup?from=llm&utm_source=opik&utm_medium=colab&utm_content=haystack&utm_campaign=opik) and grab your API Key.\n",
    "\n",
    "> You can also run the Opik platform locally, see the [installation guide](https://www.comet.com/docs/opik/self-host/overview/?from=llm&utm_source=opik&utm_medium=colab&utm_content=haystack&utm_campaign=opik) for more information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install --upgrade --quiet opik haystack-ai"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import opik\n",
    "\n",
    "opik.configure(use_local=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import getpass\n",
    "\n",
    "if \"OPENAI_API_KEY\" not in os.environ:\n",
    "    os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"Enter your OpenAI API key: \")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating the Haystack pipeline\n",
    "\n",
    "In this example, we will create a simple pipeline that uses a prompt template to translate text to German.\n",
    "\n",
    "To enable Opik tracing, we will:\n",
    "1. Enable content tracing in Haystack by setting the environment variable `HAYSTACK_CONTENT_TRACING_ENABLED=true`\n",
    "2. Add the `OpikConnector` component to the pipeline\n",
    "\n",
    "Note: The `OpikConnector` component is a special component that will automatically log the traces of the pipeline as Opik traces, it should not be connected to any other component."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"HAYSTACK_CONTENT_TRACING_ENABLED\"] = \"true\"\n",
    "\n",
    "from haystack import Pipeline\n",
    "from haystack.components.builders import ChatPromptBuilder\n",
    "from haystack.components.generators.chat import OpenAIChatGenerator\n",
    "from haystack.dataclasses import ChatMessage\n",
    "\n",
    "from opik.integrations.haystack import OpikConnector\n",
    "\n",
    "\n",
    "pipe = Pipeline()\n",
    "\n",
    "# Add the OpikConnector component to the pipeline\n",
    "pipe.add_component(\"tracer\", OpikConnector(\"Chat example\"))\n",
    "\n",
    "# Continue building the pipeline\n",
    "pipe.add_component(\"prompt_builder\", ChatPromptBuilder())\n",
    "pipe.add_component(\"llm\", OpenAIChatGenerator(model=\"gpt-3.5-turbo\"))\n",
    "\n",
    "pipe.connect(\"prompt_builder.prompt\", \"llm.messages\")\n",
    "\n",
    "messages = [\n",
    "    ChatMessage.from_system(\n",
    "        \"Always respond in German even if some input data is in other languages.\"\n",
    "    ),\n",
    "    ChatMessage.from_user(\"Tell me about {{location}}\"),\n",
    "]\n",
    "\n",
    "response = pipe.run(\n",
    "    data={\n",
    "        \"prompt_builder\": {\n",
    "            \"template_variables\": {\"location\": \"Berlin\"},\n",
    "            \"template\": messages,\n",
    "        }\n",
    "    }\n",
    ")\n",
    "\n",
    "trace_id = response[\"tracer\"][\"trace_id\"]\n",
    "print(f\"Trace ID: {trace_id}\")\n",
    "print(response[\"llm\"][\"replies\"][0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The trace is now logged to the Opik platform:\n",
    "\n",
    "![Haystack trace](https://raw.githubusercontent.com/comet-ml/opik/main/apps/opik-documentation/documentation/fern/img/cookbook/haystack_trace_cookbook.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Advanced usage\n",
    "\n",
    "### Ensuring the trace is logged\n",
    "\n",
    "By default the `OpikConnector` will flush the trace to the Opik platform after each component in a thread blocking way. As a result, you may disable flushing the data after each component by setting the `HAYSTACK_OPIK_ENFORCE_FLUSH` environent variable to `false`.\n",
    "\n",
    "**Caution**: Disabling this feature may result in data loss if the program crashes before the data is sent to Opik. Make sure you will call the `flush()` method explicitly before the program exits:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from haystack.tracing import tracer\n",
    "\n",
    "tracer.actual_tracer.flush()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Getting the trace ID\n",
    "\n",
    "If you would like to log additional information to the trace you will need to get the trace ID. You can do this by the `tracer` key in the response of the pipeline:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "response = pipe.run(\n",
    "    data={\n",
    "        \"prompt_builder\": {\n",
    "            \"template_variables\": {\"location\": \"Berlin\"},\n",
    "            \"template\": messages,\n",
    "        }\n",
    "    }\n",
    ")\n",
    "\n",
    "trace_id = response[\"tracer\"][\"trace_id\"]\n",
    "print(f\"Trace ID: {trace_id}\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "py312_llm_eval",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
